Biostatistics For Dummies (Monika Wahi John Pezzullo)

Here are some common distributions that describe the random fluctuations found in data analyzed by

biostatisticians:

Normal: The familiar, bell-shaped, normal distribution is probably the most common distribution

you will encounter. As an example, systolic blood pressure (SBP) is found to follow a normal

distribution in human populations.

Log-normal: The log-normal distribution is a right-skewed distribution. This distribution

describes many laboratory results, such as enzymes and antibody titers, where most of the

population tests on the low end of the scale. It is also the distribution seen for lengths of hospital

stays, where most stays are 0 or 1 days, and the rest are longer.

Binomial: The binomial distribution describes proportions, and represents the likelihood that a

value will take one of two mutually exclusive values (such as whether an event occurs or does not occur). As

an example, in a class held regularly where students can only pass or fail, the proportion who fail

will follow a binomial distribution.

Poisson: The Poisson distribution describes the number of occurrences of sporadic random events

(unlike the binomial distribution, which is for more common events). In biostatistics, the Poisson

distribution is used for counts of events that are not as common, such as deaths

from specific cancers each year.

Chapter 24 describes these and other distribution functions in more detail, and you also encounter them

throughout this book.
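The four distributions listed above can be explored by simulation. The following is a minimal sketch that uses only Python's standard library. The parameter values (SBP with a mean of 120 mmHg and a standard deviation of 15, a class of 30 students with a 20 percent chance of failing, and an average of 3 events per year) are illustrative assumptions, not figures from this book, and the helper names binomial_draw and poisson_draw are hypothetical.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible
N = 10_000               # simulated values per distribution

# Normal: SBP with an assumed mean of 120 mmHg and SD of 15 mmHg.
normal = [rng.gauss(120, 15) for _ in range(N)]

# Log-normal: the logarithm of each value is normally distributed
# (assumed log-scale mean 0 and SD 1).
lognormal = [rng.lognormvariate(0.0, 1.0) for _ in range(N)]


def binomial_draw(n, p):
    """Count events in n independent trials, each with probability p."""
    return sum(rng.random() < p for _ in range(n))


# Binomial: students failing in a class of 30, assumed p = 0.2.
binomial = [binomial_draw(30, 0.2) for _ in range(N)]


def poisson_draw(lam):
    """Draw one Poisson count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


# Poisson: yearly count of an uncommon event, assumed mean of 3.
poisson = [poisson_draw(3.0) for _ in range(N)]

samples = {
    "normal": normal,
    "log-normal": lognormal,
    "binomial": binomial,
    "poisson": poisson,
}
for name, values in samples.items():
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(f"{name:>10}: mean={mean:.2f}  median={median:.2f}")
```

In the printed summary, the mean and median of the normal sample should nearly coincide, while the log-normal mean sits well above its median, which is what skew toward the low end of the scale looks like numerically.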

Distributions important to statistical testing

Some probability distributions don’t describe fluctuations in data values but instead describe

fluctuations in calculated values as part of a statistical test (when you are calculating what’s called a

test statistic). Distributions of test statistics include the Student t, chi-square, and Fisher F

distributions. Test statistics are used to obtain the p values that result from the tests. See “Getting the

language down” later in this chapter for a definition of p values.
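To make the idea of a test statistic having its own distribution concrete, here is a rough sketch that builds the distribution of a one-sample Student t statistic by simulation and then reads a p value from it. This is a Monte Carlo approximation chosen because it needs only Python's standard library, not the way statistical software normally computes p values. The observed values are invented for illustration, and the function name t_statistic is hypothetical.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is reproducible


def t_statistic(sample, mu0):
    """One-sample t statistic for the null hypothesis that the mean is mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


n = 10
n_sims = 20_000

# Simulate many samples for which the null hypothesis is true (true mean 0)
# and record the t statistic from each one. The spread of these values
# approximates the Student t distribution with n - 1 degrees of freedom.
null_ts = [
    t_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0)
    for _ in range(n_sims)
]

# Made-up observed data (for example, a change score in 10 participants).
observed = [4.1, -1.2, 3.3, 5.0, 2.2, 0.4, 6.1, 1.8, 3.9, 2.4]
t_obs = t_statistic(observed, 0.0)

# Two-sided p value: the share of null t statistics at least as extreme
# as the observed one.
p_value = sum(abs(t) >= abs(t_obs) for t in null_ts) / n_sims

# Share of null t statistics beyond 1.96, the familiar normal cutoff.
tail_share = sum(abs(t) > 1.96 for t in null_ts) / n_sims

print(f"observed t = {t_obs:.2f}, simulated p value = {p_value:.4f}")
print(f"share of null t statistics beyond 1.96 = {tail_share:.3f}")
```

With only 10 values per sample, the share of simulated t statistics beyond 1.96 should come out noticeably above 5 percent, which illustrates why the t distribution rather than the normal distribution is used for small samples.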

Introducing Statistical Inference

Statistical inference is where you draw conclusions (or make inferences) about a population based on

estimates from a sample of that population. The challenge posed by statistical inference theory is

to extract real information from the noise in your data. This noise is made up of random

fluctuations as well as measurement error. This very broad area of statistical theory can be subdivided

into two topics: statistical estimation theory and statistical decision theory.

Statistical estimation theory

Statistical estimation theory focuses on how to improve the accuracy and precision of metrics calculated

from samples. It provides methods to estimate how close your sample estimates are likely to be to the true

population value, and to calculate the range of values from your sample that’s likely to include the true

population value. The following sections review the fundamentals of statistical estimation theory.
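As a small illustration of calculating a range that is likely to include the true population value, the sketch below computes a sample mean, its standard error, and an approximate 95 percent confidence interval, and then checks by simulation how often such intervals capture the true mean. It assumes a hypothetical population with a mean SBP of 120 mmHg and a standard deviation of 15, and it uses the normal approximation (a multiplier of about 1.96), which is reasonable for a sample of 100 but too narrow for small samples. The function name ci95 is hypothetical.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible

TRUE_MEAN = 120.0  # assumed population mean SBP (mmHg)
TRUE_SD = 15.0     # assumed population standard deviation (mmHg)
SAMPLE_SIZE = 100

# Multiplier for a two-sided 95% interval under the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)


def ci95(sample):
    """Return (mean, standard error, lower limit, upper limit)."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, se, mean - z * se, mean + z * se


# One sample: the estimate, its precision, and the interval around it.
sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
mean, se, lower, upper = ci95(sample)
print(f"mean = {mean:.1f}, SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")

# Many samples: count how often the interval includes the true mean.
n_repeats = 2_000
hits = 0
for _ in range(n_repeats):
    draws = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    _, _, lo, hi = ci95(draws)
    if lo <= TRUE_MEAN <= hi:
        hits += 1
coverage = hits / n_repeats
print(f"share of intervals that include the true mean = {coverage:.3f}")
```

The coverage figure should land close to 0.95, which is what "likely to include the true population value" means in practice: the procedure captures the true value in about 95 percent of repeated samples, not that any single interval is guaranteed to contain it.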

Accuracy and precision